Convergence rates of the Voting Gibbs classifier, with application to Bayesian feature selection
Authors
Abstract
The Gibbs classifier is a simple approximation to the Bayesian optimal classifier in which one samples from the posterior for the parameter, and then classifies using the single classifier indexed by that parameter vector. In this paper, we study the Voting Gibbs classifier, which is the extension of this scheme to the full Monte Carlo setting, in which N samples are drawn from the posterior and new inputs are classified by voting the N resulting classifiers. We show that the error of Voting Gibbs converges rapidly to the Bayes optimal rate; in particular, the relative error decays at a rapid O(1/N) rate. We also discuss the feature selection problem in the Voting Gibbs context. We show that there is a choice of prior for Voting Gibbs such that the algorithm has high tolerance to the presence of irrelevant features. In particular, the algorithm has sample complexity that is logarithmic in the number of irrelevant features.
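The voting scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-dimensional threshold classifier, a finite grid of candidate parameters with a uniform prior, and a likelihood in which each label is flipped independently with a fixed noise rate. The sketch compares the posterior-weighted vote with an unweighted majority vote over N posterior draws.

```python
import math
import random

def predict(theta, x):
    """Hypothetical threshold classifier indexed by theta: label 1 iff x >= theta."""
    return 1 if x >= theta else 0

def posterior(thetas, data, noise=0.1):
    """Exact posterior over a finite grid of thetas under a uniform prior.

    Assumes (for illustration) that each label is flipped independently
    with probability `noise`.
    """
    log_w = []
    for theta in thetas:
        ll = 0.0
        for x, y in data:
            ll += math.log(1 - noise if predict(theta, x) == y else noise)
        log_w.append(ll)
    m = max(log_w)  # subtract the max before exponentiating for stability
    w = [math.exp(l - m) for l in log_w]
    total = sum(w)
    return [v / total for v in w]

def bayes_vote(thetas, post, x):
    """Posterior-weighted vote over every classifier in the grid."""
    p1 = sum(p for theta, p in zip(thetas, post) if predict(theta, x) == 1)
    return 1 if p1 >= 0.5 else 0

def voting_gibbs(thetas, post, x, n, rng):
    """Draw n parameters from the posterior and take an unweighted majority vote."""
    draws = rng.choices(thetas, weights=post, k=n)
    votes = sum(predict(theta, x) for theta in draws)
    return 1 if 2 * votes >= n else 0

rng = random.Random(0)
thetas = [i / 20 for i in range(21)]  # candidate thresholds 0.0, 0.05, ..., 1.0
data = [(0.1, 0), (0.2, 0), (0.35, 0), (0.6, 1), (0.7, 1), (0.9, 1)]
post = posterior(thetas, data)
for x in (0.05, 0.3, 0.8):
    print(x, bayes_vote(thetas, post, x), voting_gibbs(thetas, post, x, n=101, rng=rng))
```

With n = 1, `voting_gibbs` reduces to the plain Gibbs classifier, which classifies with a single posterior draw. An odd n avoids ties in the majority vote.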
Similar resources
A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique for eliminating irrelevant and redundant features, which enhances the performance of classifiers. When a dataset contains many irrelevant and redundant features, accuracy fails to improve and classifier performance drops. To avoid this, this paper presents a new hybrid feature selection method usi...
Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data
Introduction: Early diagnosis of breast cancer and the identification of effective genes are important issues in the treatment and survival of the patients. Gene expression data obtained using DNA microarray in combination with machine learning algorithms can provide new and intelligent methods for diagnosis of breast cancer. Methods: Data on the expression of 9216 genes from 84 patients across...
Bayesian Inference of (Co) Variance Components and Genetic Parameters for Economic Traits in Iranian Holsteins via Gibbs Sampling
The aim of this study was to use a Bayesian approach via Gibbs sampling (GS) to estimate genetic parameters of production, reproduction and health traits in Iranian Holstein cows. Data consisted of 320666 first-lactation records of Holstein cows from 7696 sires and 260302 dams collected by the animal breeding center of Iran from 1991 to 2010. (Co)variance components were estimated using ...
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets
Objective(s): This study addresses feature selection for breast cancer diagnosis. The present process uses a wrapper approach with GA-based feature selection and a PS-classifier. The experimental results show that the proposed model is comparable to other models on the Wisconsin breast cancer datasets. Materials and Methods: To evaluate the effectiveness of the proposed feature selection method, we ...
Effective Feature Selection for Pre-Cancerous Cervix Lesions Using Artificial Neural Networks
Since the most common form of cervical cancer starts with pre-cancerous changes, flawless detection of these changes is important for preventing and treating cervical cancer. There are two ways to stop this disease from developing. One way is to find and treat pre-cancers before they become true cancers, and the other is to prevent the pre-cancers in the first place. The presented approach...